2010  Monitoring  Research  Review:  Ground-Based  Nuclear  Explosion  Monitoring  Technologies 


Seismic  Velocity  Estimation  in  the  Middle  East  from  Multiple  Waveform 
Functionals:  P  4&  S  Receiver  Functions,  Waveform  Fitting, 

AND  Surface  Wave  Dispersion 


Mrinal  K.  Sen\  Jay  Pulliam^,  Utpal  Dutta^,  Ranjana  Ghosh\  Rengin  Gok"^,  and  Michael  Pasyanos"^ 

The  University  of  Texas  at  Austin^  Baylor  University^,  University  of  Alaska,  Anchorage^, 
and  Lawrence  Livermore  National  Laboratory"^ 


Sponsored  by  the  National  Nuclear  Security  Administration 


Award  Nos.  DE-AC52-09NA29327  and  DE-AC52-07NA27344 
Proposal  No.  BAA09-58 


ABSTRACT 

We  demonstrated  that  SPE  waveform  data,  together  with  associated  phases,  are  able  to  constrain  crustal  and  upper 
mantle  seismic  wave  velocity  structures.  Forward  modeling  is  done  using  a  reflectivity  algorithm  and  optimization  is 
carried  out  using  an  algorithm  called  very  fast  simulated  annealing  (VESA).  The  code  runs  efficiently  on  multiple 
processors  using  MPI  routines.  The  advantages  of  VESA  include 

•  It  is  derivative  free  and  therefore,  the  objective  function  can  be  changed  to  include  disparate  datasets 
without  any  changes  in  the  algorithm,  and 

•  Models  from  multiple  VESA  runs  can  be  used  to  obtain  estimates  of  uncertainty. 

We  are  currently  developing  a  new  algorithm  to  derive  crustal  velocity  structure  by  modeling  SPE  wavetrains, 
surface  wave  dispersion  and  receiver  function  data  from  collocated  source-receiver  pairs.  Optimization  is  done  using 
a  three  part  objective  function  given  by 

Error  =  a,  ||  11  +a,  ||  d'’^*  - d*^"  11  +a,  ||  d'’^"  - d*^"  11  ,  (1) 

^  SPL  SPL  'P  ^  dispersion  dispersion  'P  ^  receiverfucntion  receiverfucntion 

where  ai,  ^2  and  are  the  relative  weights  of  different  data  and  p  is  the  norm  used  to  measure  data  misfit.  The 
algorithm  is  very  general  in  that  the  weights  can  change  with  iteration  and  p  can  take  different  values  for  the  three 
different  parts  of  the  objective  function.  Note  that  the  same  model  is  used  to  compute  synthetic  SPL  waveform, 
surface  waves  and  receiver  function  data. 

Since  SPL  calculation  takes  the  longest,  the  reflectivity  calculation  is  distributed  over  ray-parameters  using  MPI 
calls.  On  the  other  hand,  one  node  each  is  allocated  to  compute  surface  wave  dispersion  and  receiver  function, 
resulting  in  efficient  load  balancing.  The  following  issues  are  currently  being  investigated: 

•  Types  of  objective  functions  to  be  used, 

•  Values  of  weights  assigned  to  different  data  types, 

•  Influence  of  different  datasets  on  the  derived  model  estimates. 

The  algorithm  will  be  applied  to  data  recorded  at  broadband  seismic  stations  in  the  Middle  East,  an  area  of  strategic 
interest  and  one  for  which  there  exist  previous  results  of  surface  wave  dispersion  and  receiver  function  m  modeling. 
We  have  developed  statistical  tools,  including  estimates  of  the  Posterior  Probability  Density  (PPD)  functions  and 
parameter  correlation  matrices,  which  allow  us  to  evaluate  the  uniqueness  and  physical  feasibility  of  resulting 
models.  These  tools  will  allow  us  to  evaluate  the  relative  strength  of  constraints  placed  on  model  parameters  by  each 
type  of  data  in  addition  to  the  portions  of  the  model  that  are  well-  or  poorly  constrained. 
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OBJECTIVES 

One  goal  of  the  BAA  to  which  this  proposal  responds  is  to  develop  optimal  procedures  for  the  use  of  multiple 
datasets.  Due  to  the  inherent  variability,  inconsistency,  and  peculiarities  of  disparate  datasets  and  the  well-known 
nonlinearity  and  non-uniqueness  associated  with  geophysical  modeling,  such  procedures  must  include  methods  for 
evaluating  the  performance  and  contribution  of  each  dataset  to  the  final  results. 

Our  approach  makes  use  of  quantitative  assessment  tools  and  a  well-developed  Bayesian  approach  to  explore  and 
evaluate  each  step  of  the  modeling  process,  rather  than  to  simply  toss  all  constraints  into  a  simultaneous  fitting 
procedure  to  find  the  single  solution  that  satisfies  particular  criteria.  The  procedure  we  propose,  best  characterized  as 
velocity  analysis  via  optimization,  is  analogous  to  velocity  analysis  in  exploration  seismology,  rather  than 
“inversion”.  It  will  provide  quantitative  error  measures  of  structural  parameter  estimates  that  can  then  be  translated 
to  earthquake  location  errors,  and  thus  guide  seismologists  toward  the  most  effective  and  efficient  ways  to  improve 
model  reliability. 

RESEARCH  ACCOMPLISHED 

We  have  made  substantial  progress  in  developing,  testing,  and  applying  the  method  for  waveform  modeling  via 
global  optimization  described  above  (Gangopadhyay  et  al.,  2007;  Pulliam  and  Sen,  2005;  Pulliam  et  al.,  2006). 
Earlier  we  developed,  with  Lian-She  Zhao  and  Cliff  Frohlich,  a  VESA-based  inversion  algorithm  for  P-wave 
receiver  functions  (Zhao  and  Frohlich,  1996;  Zhao  et  al.,  1996).  Additional  tools  we  developed  include  a  fast, 
parallelized  reflectivity  code  that  incorporates  anisotropic  model  parameters,  additional  global  optimization  methods 
based  upon  genetic  algorithms,  and  numerous  imaging  algorithms. 

Specifically  with  regard  to  waveform  modeling:  We  developed  a  code  that  computes  synthetic  seismograms  with  a 
parallelized  reflectivity  method  and  fits  the  observed  waveforms  by  global  optimization.  Our  method  also  computes 
the  Posterior  Probability  Density  (PPD)  function  and  correlation  matrix,  to  evaluate  the  uniqueness  of  the  resulting 
models,  and  the  trade-offs  between  individual  model  parameters  therein.  We  applied  the  code  to  determine  the  crust 
and  upper  mantle  structure  beneath  permanent  broadband  seismic  stations  in  Africa  using  teleseismic  earthquakes 
(M  5. 5-7.0  and  200-800  km  focal  depth)  recorded  at  these  stations.  We  modeled  the  S,  SP,  SsPmP,  and  shear- 
coupled  PL  waves  from  these  earthquakes  and  our  P-  and  S-wave  velocity  models  compare  well  with,  and  in  some 
cases  improve  upon  the  models  obtained  from  other  existing  methods.  We  obtained  P-  and  S-wave  velocities 
simultaneously  and  our  use  of  the  shear-coupled  PL  phase  wherever  available  improved  constraints  on  the  models  of 
the  lower  crust  and  upper  mantle(Gangopadhyay  et  al.,  2007). 

During  the  past  year,  we  have  accomplished  the  following  tasks  listed  below.  Later  we  describe  these  in  detail. 

1 .  testing  of  waveform,  surface  wave  dispersion  and  receiver  function  modeling  codes  using  identical  inputs, 

2.  development  of  a  joint  inversion  code  using  all  three  different  types  of  data  sets, 

3.  extensive  testing  of  objective  functions  appropriate  for  each  data  set, 

4.  efficient  parallelization  of  the  joint  inversion  code, 

5.  extensive  testing  with  synthetic  data,  and 

6.  preliminary  tests  with  a  data  set  from  the  middle  east. 

Joint  Modeling  of  Multiple  Datasets 

There  are  several  advantages  to  jointly  modeling  multiple  datasets.  First,  each  data  functional  has  unique 
sensitivities  to  Earth  structure.  For  example,  receiver  functions  are  primarily  sensitive  to  shear  wave  velocity 
contrasts  and  vertical  traveltimes  while  surface  wave  dispersion  measurements  are  sensitive  to  vertical  shear  wave 
velocity  averages  (Julia  et  al.,  2000).  Full  waveform  modeling  of  S,  Sp,  SsPmP  phases,  in  contrast,  are  more 
sensitive  compressional  wave  velocity  contrasts  and  vertical  traveltimes;  adding  SPL  to  the  modeling  improves 
senstivity  to  the  uppermost  mantle  shear  wave  structure  and  to  velocity  contrasts  across  the  Moho  (Gangopadhyay  et 
al.,  2007).  The  amplitudes  and  signal-to-noise  characteristics  of  waves  that  produce  these  data  functionals  depend  on 
several  factors,  including  epicentral  distance,  event  focal  depth,  fault  mechanism  and  radiation  patterns,  source  time 
function,  properties  of  the  intervening  Earth  structure  (including  attenuation,  low  velocity  zones,  velocities, 
heterogeneity,  and  anisotropy),  and  characteristics  of  the  recording  seismometer.  Since  regions  of  high  seismicity 
are  highly  restricted,  many  stations  are  not  well  situated  to  record  appropriate  events.  An  algorithm  that  incorporates 
multiple  data  types  is  more  widely  applicable  than  one  that  relies  solely  on  a  single  type. 

However,  with  these  advantages  come  some  disadvantages.  For  example,  combining  disparate  data  types  requires 
great  care  in  their  treatment  and  assessment  (Roy  et  al.,  2005).  Benefits  of  additional  data  may  be  null  if  the  method 
used  to  model  them  preferentially  fits  one  type.  Or,  worse,  minimizing  an  inappropriate  criterion  in  conjunction  with 
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incompatible  data  may  “split  the  difference”  between  them  to  choose  a  model  that  is  wholly  inaccurate  and 
inappropriate  for  its  intended  purpose.  If  the  model  is  to  be  used  for  regional  or  local  earthquake  locations,  for 
example,  it  would  be  a  mistake  to  rely  on  the  best  fit  to  surface  wave  dispersion.  The  judgment  and  experience  of 
seismologists  who  keep  a  clear  eye  on  their  goal  is  critical,  and  this  experience  must  be  combined  with  rigor  in  the 
computational  modeling.  This  experience  and  judgment  can  be  incorporated  after  the  fact,  as  is  sometimes  the  case 
with,  for  example,  the  smoothing  applied  to  3D  tomographic  models  but  it  is  usually  better  to  acknowledge  the 
“prior”  explicitly  at  the  outset.  This  is  one  advantage  of  the  Bayesian  approach  that  we  propose  here.  While  priors 
are  sometimes  referred  to  pejoratively  as  “bias”,  their  explicit  statement  during  the  formulation  of  the  modeling 
algorithm  forestalls  serious  criticism  and  enables  a  clear,  quantitative  discussion  of  “bias”. 

Next,  as  a  general  rule  of  thumb,  a  greater  number  of  data  functionals  incorporated  into  modeling  will  result  in  a 
broader  range  of  model  parameter  sensitivities,  and  it  will  be  less  likely  that  a  linear  inversion  approach  will  be 
adequate.  This  is  unfortunate  because  linear  approaches  are  much  more  tractable  and  straightforward  than  nonlinear 
methods.  But  only  a  broad  search  of  the  model  space  will  demonstrate  whether  a  linear  approach  is  valid.  Non-linear 
global  optimization  algorithms  require  no  change  in  the  algorithm  to  include  multi-part  objective  function  with 
different  norms.  A  variety  of  methods  for  nonlinear  inversion  are  now  available  and  the  only  real  cost  for  conducting 
a  thorough  search  and  finding  the  single  best-fitting  model  is  in  computational  effort  and  complexity.  The  downside 
is  that  theoretically  exact  methods  for  assessing  model  uncertainties  and  model  reliability  are  not  generally  tractable, 
and  approximate  methods  result  in  significantly  greater  cost  and  complexity  than  is  required  to  find  the  best-fitting 
model  alone.  Nevertheless,  our  previous  work  has  demonstrated  that  useful  methods  are  indeed  tractable,  and  the 
method  we  propose  will  add  only  marginal  increases  in  computation  time. 

While  surface  wave  dispersion  and  receiver  functions  have  been  modeled  jointly  (Ammon  et  al.,  2005;  Ammon  et 
ah,  2004;  Cakir  and  Erduran,  2004;  Chang  et  al.,  2004;  Dugda  and  Nyblade,  2006;  Herrmann  et  al.,  2001;  Julia  et 
al.,  2000;  Julia  et  al.,  2005;  Lawrence  and  Wiens,  2004;  Ozalaybey  et  al.,  1997;  Tkalcic  et  al.,  2006),  no  study  has, 
to  our  knowledge,  incorporated  waveform  constraints  such  as  S,  Sp,  SsPmP,  and  Shear-coupled  PL.  further,  none 
has  conducted  a  thorough,  nonlinear  assessment  of  the  constraints  provided  by  each  functional. 

Multi-Objective  Optimization  for  Multiple  Datasets 

The  primary  goal  of  our  current  project  is  to  develop  a  tool  for  estimating  crustal  structure  that  can  explain  surface 
wave  dispersion,  receiver  function  and  full  waveform  modeling.  Thus  our  optimization  algorithm  is  expected  to 
minimize  misfit  of  each  of  the  datasets  using  three  different  objective  or  cost  functions.  The  process  of  optimizing 
systematically  and  simultaneously  a  collection  of  objective  functions  is  called  multiobjective  optimization  (MOO)  or 
vector  optimization.  The  general  MOO  is  posed  as  follows: 

Minimize  F(m)  =  [Fi(m),F2(m),...,F;^(m)]^  ,  (2) 

m 

where  m  is  the  model  vector,  and  k  is  the  number  of  objective  functions.  The  objective  function  is  minimized  subject 
to  some  constraints.  In  our  application,  we  use  standard  constraints  such  as  bounds  and  negativity. 

Contrary  to  single-objective  optimization,  an  MOO  may  be  considered  more  of  a  concept  than  a  definition. 

Typically,  with  noisy  data  there  is  no  single  global  solution  and  it  is  often  necessary  to  determine  a  set  of  points  that 
all  fit  a  predetermined  definition  for  an  optimum.  This  is  commonly  done  using  what  is  known  as  Pareto  optimality 
(Pareto  1906)  defined  as  follows. 

Definition:  A  model  m*  belonging  in  the  model  space  is  said  to  be  Pareto  optimal  if  and  only  if  there  does  not  exist 
any  other  model  m  in  the  model  space  such  that  F(m)  <  F(in),  andT}(m)  <  T}(m*)  for  at  least  one  function. 

We  often  use  a  Global  criterion  in  search  for  models  belonging  to  Pareto  set,  which  is  a  scalar  function  that 
mathematically  combines  multiple  objective  functions;  it  may  or  may  not  involve  preference  of  one  or  the  other. 

Objective  Function 

We  use  the  following  objective  function  which  outputs  a  single  scalar  value. 

Error  =  a,  1 1  |  L  +a,  1 1  d'’"*  -  d*^"  | L  +a,  1 1  d'’"^  -  d*^"  1 1  (3) 

^  SPL  SPL  'P  ^  dispersion  dispersion  'P  ^  receiverfucntion  receiverfiicntion 
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This  general  objective  function  allows  for  different  measures  of  error  and  different  weights  to  the  datasets.  There  are 
at  least  three  fundamental  issues  that  we  need  to  address: 

1 .  What  measures  of  error  do  we  use?  Can  they  be  different  for  different  datasets? 

2.  Can  we  use  simple  differences? 

3.  How  do  we  choose  the  three  weights? 

Question  (3)  does  not  generally  have  a  straightforward  answer.  The  weights  have  to  be  determined  by  trial  and  error. 
One  important  consideration  in  defining  the  global  objective  function  for  MOO  is  to  ascertain  that  no  individual 
objective  function  introduces  more  impact  than  the  others  unless  so  desired.  To  address  this,  we  use  the  following 
normalized  objective  function  defined  in  Sen  and  Stoffa  (1995) 


Error  =  1  - 


\a 


^obs  ^syn 


+ 


^obs  ^syn 


(4) 


where  the  sum  is  taken  over  all  the  data  points  and  the  parameter  a  is  equivalent  to  norm. 

Initially  we  chose  a=l  for  all  three  objective  functions.  Note  that  the  SPL  and  receiver  function  make  use  of 
waveform  data  while  the  dispersion  curve  is  fairly  simple.  The  objective  function  corresponding  to  the  surface  wave 
disersion  is  smoothly  varying  and  therefore,  we  chose  a=0.5  to  introduce  more  sensitivity.  With  these  choices  of  a 
values  and  normalized  objective  functions  as  defined  in  the  above  equation,  we  employ  equal  weights  to  all  three 
parts  of  the  objective  function. 

There  are  several  advantages  to  jointly  modeling  multiple  datasets.  First,  each  data  functional  has  unique 
sensitivities  to  Earth  structure.  For  example,  receiver  functions  are  primarily  sensitive  to  shear  wave  velocity 
contrasts  and  vertical  traveltimes  while  surface  wave  dispersion  measurements  are  sensitive  to  vertical  shear  wave 
velocity  averages  (Julia  et  al.,  2000).  Full  waveform  modeling  of  S,  Sp,  SsPmP  phases,  in  contrast,  are  more 
sensitive  compressional  wave  velocity  contrasts  and  vertical  traveltimes;  adding  SPL  to  the  modeling  improves 
senstivity  to  the  uppermost  mantle  shear  wave  structure  and  to  velocity  contrasts  across  the  Moho  (Gangopadhyay  et 
al.,  2007).  Mutually  satisfying  constraints  imposed  by  the  three  datasets  may  constrain  a  larger  subset  of  model 
parameters  than  a  single  set  alone,  resulting  in  a  single  model  that  better  represents  the  true  structure. 

Optimization  and  Parallelization 

Having  defined  the  objective  function,  we  employ  VFSA  for  optimization  (Pulliam  and  Sen,  2005;  Sen  and  Stoffa 
1995).  Note  that  VFSA  causes  no  difficulties  since  we  do  not  require  computing  derivatives.  As  used  in  Pulliam  and 
Sen  (2005),  we  use  a  parallel  optimization  code  for  MOO  as  well.  Note  that  the  most  expensive  computation 
element  is  that  of  SPL  waveform  by  the  reflectivity  method  and  therefore,  the  reflectivity  computation  is  distributed 
over  multiple  nodes.  Since  surface  wave  dispersion  and  receiver  function  computations  are  fairly  fast,  we  distribute 
those  to  one  node  each  and  thus  achieve  reasonable  load  balancing. 

Results  from  a  Synthetic  Experiment 

We  generated  synthetic  S -waveform  for  three  component  seismograms,  surface  wave  dispersion  and  receiver 
function  for  a  simple  six  layer  model.  This  dataset  is  then  used  as  observation  for  optimization.  Our  model’s  search 
parameters  are  Vp,  Vs,  and  layer  thickeness  for  the  six  layers.  We  use  search  bounds  for  the  three  model  parameters, 
as  shown  in  Fig  1(a).  Figures  1,  2,  and  3  show  results  from  our  inversion  experiments  using  three  different 
combination  of  weights,  namely,  (1)  receiver  function  0,  dispersion  0,  SPL  1;  (2)  receiver  function  0,  dispersion  1, 
SPL  1,  and  (3)  receiver  function  1,  dispersion  0,  SPL  1. 

Thus  Figure  3  corresponds  to  the  case  where  all  three  datasets  are  weighed  equally.  The  results  show  a  mean  model 
and  posterior  variances  corresponding  to  different  model  parameters  (see  Pulliam  and  Sen  2005  for  more  details  on 
this).  Note  that  we  run  multiple  VFSA  with  different  starting  models  and  use  all  the  models  to  characterize 
uncertainty.  In  other  words,  we  attempt  to  sample  many  models  from  the  Pareto  set.  The  impact  of  different  weights 
is  fairly  obvious  in  the  three  figures.  It  seems  SPL  data  has  the  most  impact;  by  setting  SPL=I  and  the  other  two  to 
zeros,  we  are  able  to  obtain  reasonable  fit  of  surface  wave  dispersion  and  receiver  function.  When  all  the  weights  are 
set  equal  to  one,  we  are  able  to  match  all  three  datasets  and  reproduce  the  model  fairly  well. 
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CONCLUSIONS  AND  RECOMMENDATIONS 

We  have  developed  a  parallel  VFSA  based  multi-objective  optimization  code  that  can  be  used  to  obtain  crustal 
velocity  structures  by  modeling  SPL  waveform,  receiver  function  and  surface  wave  dispersion  data.  The  code  has 
been  tested  successfully  with  a  simple  synthetic  example.  We  will  continue  to  test  our  code  using  more  realistic 
synthetics  and  will  then  apply  extensively  to  several  datasets  from  the  Middle  East. 
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(a)  (c) 


vertical  ccmpcmeril 


(b) 


Figure  1.  Results  from  joint  inversion  code:  the  weight  for  SPL  data  was  set  equal  to  one;  the  other  two 

weights  are  zero,  (a)  surface  wave  dispersion,  (b)  SPL  waveforms,  (c)  receiver  function,  (d)  model 
fit.  In  figures  (a)-(c),  blue  is  the  observed  data  and  red  the  synthetic  data.  In  figure  (d),  blue  dotted 
lines  are  the  model  bounds;  mean  estimated  model  is  the  red  line  and  the  grey  area  is  the 
uncertainty  in  the  estimated  model. 
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Figure  2.  Results  from  joint  inversion  code:  the  weights  for  SPL  and  receiver  function  data  were  set  equal  to 
one  only,  (a)  surface  wave  dispersion,  (b)  SPL  waveforms,  (c)  receiver  function,  (d)  model  fit.  In 
Figures  (a)-(c),  blue  is  the  observed  data  and  red  the  synthetic  data.  In  Figure  (d),  blue  dotted  lines 
are  the  model  bounds;  mean  estimated  model  is  the  red  line  and  the  grey  area  is  the  uncertainty  in 
the  estimated  model. 
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Figure  3.  Results  from  joint  inversion  code:  all  threee  weights  were  set  equal  to  one.  (a)  surface  wave 

dispersion,  (b)  SPL  waveforms,  (c)  receiver  function,  (d)  model  fit.  In  Figures  (a)-(c),  blue  is  the 
observed  data  and  red  the  synthetic  data.  In  Figure  (d),  blue  dotted  lines  are  the  model  bounds; 
mean  estimated  model  is  the  red  line  and  the  grey  area  is  the  uncertainty  in  the  estimated  model. 
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